
    Worst-case Complexity of Cyclic Coordinate Descent: $O(n^2)$ Gap with Randomized Version

    This paper concerns the worst-case complexity of cyclic coordinate descent (C-CD) for minimizing a convex quadratic function, which is equivalent to the Gauss-Seidel method and can be transformed into the Kaczmarz method and projection onto convex sets (POCS). We observe that the known provable complexity of C-CD can be $O(n^2)$ times worse than that of randomized coordinate descent (R-CD), but no example had been rigorously proven to exhibit such a large gap. In this paper we show that the gap indeed exists. We prove that there exists an example for which C-CD takes at least $O(n^4 \kappa_{\text{CD}} \log\frac{1}{\epsilon})$ operations, where $\kappa_{\text{CD}}$ is related to Demmel's condition number and determines the convergence rate of R-CD. This implies that in the worst case C-CD can indeed be $O(n^2)$ times slower than R-CD, which has complexity $O(n^2 \kappa_{\text{CD}} \log\frac{1}{\epsilon})$. Note that for this example, the gap exists for any fixed update order, not just a particular one. Based on the example, we establish several almost tight complexity bounds of C-CD for quadratic problems. One difficulty with the analysis is that the spectral radius of a non-symmetric iteration matrix does not necessarily constitute a \textit{lower bound} for the convergence rate. An immediate consequence is that for the Gauss-Seidel method, the Kaczmarz method and POCS, there is also an $O(n^2)$ gap between the cyclic and randomized versions (for solving linear systems). We also show that the classical convergence rate of POCS by Smith, Solmon and Wagner [1] is always worse, and sometimes infinitely worse, than our bound. Comment: 47 pages. Adds tables summarizing the main convergence rates, a comparison with the classical POCS bound, and a discussion of another example.
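    A minimal numerical sketch of the cyclic-versus-randomized comparison (the near-rank-one quadratic below is only illustrative of the hard regime, not the paper's exact construction, and the function name is mine):

```python
# Minimal sketch (not the paper's exact construction): compare cyclic and
# randomized coordinate descent on f(x) = 0.5 * x'Ax with A close to the
# all-ones matrix, the regime where cyclic orderings are known to suffer.
import numpy as np

def coordinate_descent(A, x0, n_epochs, order="cyclic", seed=0):
    """Exact coordinate minimization for f(x) = 0.5 * x'Ax."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    n = len(x)
    history = []
    for _ in range(n_epochs):
        idx = range(n) if order == "cyclic" else rng.integers(0, n, size=n)
        for i in idx:
            # Minimize over coordinate i with the others fixed.
            x[i] -= A[i] @ x / A[i, i]
        history.append(0.5 * x @ A @ x)
    return np.array(history)

n, c = 100, 0.99                      # c near 1 makes A nearly rank-one
A = (1 - c) * np.eye(n) + c * np.ones((n, n))
x0 = np.ones(n)
f_cyc = coordinate_descent(A, x0, 50, order="cyclic")
f_rnd = coordinate_descent(A, x0, 50, order="random")
print("after 50 epochs  cyclic:", f_cyc[-1], " randomized:", f_rnd[-1])
```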

    On Solving Fewnomials Over Intervals in Fewnomial Time

    Let f be a degree D univariate polynomial with real coefficients and exactly m monomial terms. We show that in the special case m=3 we can approximate within eps all the roots of f in the interval [0,R] using just O(log(D) log(D log(R/eps))) arithmetic operations. In particular, we can count the number of roots in any bounded interval using just O(log^2 D) arithmetic operations. Our speed-ups are significant and near-optimal: the asymptotically sharpest previous complexity upper bounds for both problems were super-linear in D, while our algorithm has complexity close to the respective lower bounds. We also discuss conditions under which our algorithms can be extended to general m, and a connection to a real analogue of Smale's 17th Problem. Comment: 19 pages, 1 encapsulated postscript figure. Major revision correcting many typos and minor errors; additional discussion on the connection to Smale's 17th Problem and some new references are included.
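    For intuition about why m=3 is special, here is a baseline sketch (not the paper's O(log D)-operation algorithm; the function name and example are mine): after factoring out a power of x, a trinomial's derivative is a binomial with at most one positive root, so the at most two roots in (0,R] can be bracketed around that critical point and refined by bisection.

```python
# Illustrative baseline (not the paper's algorithm): for a trinomial
# f(x) = c0 + c1*x**a + c2*x**D with 0 < a < D, the derivative has at most one
# positive root, so f has at most two roots in (0, R]; bracket them around the
# unique critical point and refine each by bisection.
def trinomial_roots(c0, c1, c2, a, D, R, eps=1e-12):
    f = lambda x: c0 + c1 * x**a + c2 * x**D

    if c1 * c2 < 0:
        # Unique positive critical point of f: a*c1*x**(a-1) + D*c2*x**(D-1) = 0.
        x_crit = (-a * c1 / (D * c2)) ** (1.0 / (D - a))
        cuts = [0.0, x_crit, R] if x_crit < R else [0.0, R]
    else:
        cuts = [0.0, R]          # f is monotone on (0, R]: at most one root there

    roots = []
    for lo, hi in zip(cuts, cuts[1:]):
        if f(lo) == 0.0:
            roots.append(lo)
        if f(lo) * f(hi) < 0:    # sign change => exactly one root inside
            while hi - lo > eps:
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    if f(R) == 0.0:
        roots.append(R)
    return roots

# Example: f(x) = 1 - 3*x**2 + x**5 has two roots in [0, 2].
print(trinomial_roots(1.0, -3.0, 1.0, 2, 5, 2.0))
```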

    Market Making with Model Uncertainty

    Pari-mutuel markets are trading platforms through which a common market maker simultaneously clears multiple contingent-claims markets. This market has several distinctive properties that began attracting the attention of the financial industry in the 2000s. For example, the platform aggregates liquidity from the individual contingent-claims markets into a common pool while shielding the market maker from potential financial loss. The contribution of this paper is twofold. First, we provide a new economic interpretation of the market-clearing strategy of a pari-mutuel market that is well known in the literature: the pari-mutuel auctioneer is shown to be equivalent to a market maker with extreme ambiguity aversion toward the future contingent event. Second, based on this theoretical understanding, we present a new market-clearing algorithm called the Knightian Pari-mutuel Mechanism (KPM). The KPM retains many interesting properties of pari-mutuel markets while explicitly controlling for the market maker's ambiguity aversion. In addition, the KPM is computationally efficient in that it is solvable in polynomial time.
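    As a toy illustration of the basic pari-mutuel mechanism the paper builds on (the wager figures are hypothetical, and this is not the KPM algorithm itself): wagers on mutually exclusive outcomes are pooled, implied prices are the wagered fractions, and the pool is redistributed to backers of the realized outcome, so the market maker bears no risk.

```python
# Toy pari-mutuel pool (illustration only; not the paper's KPM mechanism).
# Wagers on mutually exclusive outcomes are pooled; implied prices are the
# wagered fractions; the pool is paid out to backers of the realized outcome,
# so the operator's net position is zero regardless of which outcome occurs.
wagers = {"A": 600.0, "B": 300.0, "C": 100.0}   # hypothetical wagers per outcome

pool = sum(wagers.values())
prices = {k: v / pool for k, v in wagers.items()}           # implied probabilities
payout_per_unit = {k: pool / v for k, v in wagers.items()}  # gross odds if k wins

print("implied prices :", prices)
print("payout per unit:", payout_per_unit)
# e.g. a 1.0 wager on C returns pool/100 = 10.0 if C occurs, 0 otherwise;
# total payouts always equal the pool, shielding the market maker from loss.
```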

    Managing Randomization in the Multi-Block Alternating Direction Method of Multipliers for Quadratic Optimization

    The Alternating Direction Method of Multipliers (ADMM) has gained a lot of attention for solving large-scale, objective-separable constrained optimization. However, the two-block variable structure of the ADMM still limits the practical computational efficiency of the method, because at least one big matrix factorization is needed even for linear and convex quadratic programming. This drawback may be overcome by enforcing a multi-block structure on the decision variables in the original optimization problem. Unfortunately, the multi-block ADMM, with more than two blocks, is not guaranteed to converge. On the other hand, two positive developments have been made: first, if in each cyclic loop one randomly permutes the updating order of the multiple blocks, then the method converges in expectation for solving any system of linear equations with any number of blocks; second, such a randomly permuted ADMM also works for equality-constrained convex quadratic programming even when the objective function is not separable. The goal of this paper is twofold. First, we add more randomness into the ADMM by developing a randomly assembled cyclic ADMM (RAC-ADMM) in which the decision variables in each block are randomly assembled. We discuss the theoretical properties of RAC-ADMM, show when random assembling helps and when it hurts, and develop a criterion to guarantee that it converges almost surely. Second, using the theoretical guidance on RAC-ADMM, we conduct multiple numerical tests on solving both randomly generated and large-scale benchmark quadratic optimization problems, which include continuous and binary graph-partition and quadratic-assignment problems as well as selected machine learning problems. Our numerical tests show that RAC-ADMM, with a variable-grouping strategy, can significantly improve computational efficiency on most quadratic optimization problems. Comment: Expanded and streamlined theoretical sections; added comparisons with other multi-block ADMM variants; updated the Computational Studies section on continuous problems, reporting primal and dual residuals instead of the objective value gap; added selected machine learning problems (ElasticNet/Lasso and Support Vector Machine) to the Computational Studies section.
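    A compact sketch of the random-assembly step (a simplification in my own notation, not the paper's implementation): in each cycle the variables of an equality-constrained convex QP are shuffled and re-partitioned into blocks, each block subproblem of the augmented Lagrangian is solved exactly, and the multipliers are then updated.

```python
# Minimal sketch of randomly assembled multi-block ADMM (RAC-ADMM idea) for
#   minimize 0.5*x'Hx + c'x  subject to  Ax = b,
# using the augmented Lagrangian L(x,y) = f(x) + y'(Ax-b) + 0.5*beta*||Ax-b||^2.
# This illustrates the random-assembly step only, not the paper's code.
import numpy as np

def rac_admm(H, c, A, b, n_blocks=4, beta=1.0, n_cycles=200, seed=0):
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    x = np.zeros(n)
    y = np.zeros(A.shape[0])
    for _ in range(n_cycles):
        # Randomly assemble the variables into blocks for this cycle.
        perm = rng.permutation(n)
        for blk in np.array_split(perm, n_blocks):
            rest = np.setdiff1d(np.arange(n), blk)
            # Exact minimization of the augmented Lagrangian over x[blk]:
            # (H_bb + beta*A_b'A_b) x_b = -(c_b + A_b'y + H_br x_r + beta*A_b'(A_r x_r - b))
            Ab, Ar = A[:, blk], A[:, rest]
            M = H[np.ix_(blk, blk)] + beta * Ab.T @ Ab
            rhs = -(c[blk] + Ab.T @ y + H[np.ix_(blk, rest)] @ x[rest]
                    + beta * Ab.T @ (Ar @ x[rest] - b))
            x[blk] = np.linalg.solve(M, rhs)
        y = y + beta * (A @ x - b)       # multiplier update
    return x, y

# Tiny random convex QP for demonstration.
rng = np.random.default_rng(1)
n, m = 20, 5
Q = rng.standard_normal((n, n)); H = Q.T @ Q + np.eye(n)
c = rng.standard_normal(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)
x, y = rac_admm(H, c, A, b)
print("primal residual ||Ax-b|| =", np.linalg.norm(A @ x - b))
```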

    Stochastic Combinatorial Optimization under Probabilistic Constraints

    In this paper, we present approximation algorithms for combinatorial optimization problems under probabilistic constraints. Specifically, we focus on stochastic variants of two important combinatorial optimization problems, the k-center problem and the set cover problem, with uncertainty characterized by a probability distribution over the set of points or elements to be covered. We consider these problems under adaptive and non-adaptive settings, and present efficient approximation algorithms for the case when the underlying distribution is a product distribution. In contrast to the expected-cost model prevalent in the stochastic optimization literature, our problem definitions support restrictions on the probability distributions of the total costs, by incorporating constraints that bound the probability with which the incurred costs may exceed a given threshold.

    Likelihood Robust Optimization for Data-driven Problems

    We consider optimal decision-making problems in an uncertain environment. In particular, we consider the case in which the distribution of the input is unknown, yet there is abundant historical data drawn from that distribution. In this paper, we propose a new type of distributionally robust optimization model, the likelihood robust optimization (LRO) model, for this class of problems. In contrast to previous work on distributionally robust optimization that focuses on certain parameters (e.g., mean, variance, etc.) of the input distribution, we exploit the historical data and define the accessible distribution set to contain only those distributions under which the observed data achieve a certain level of likelihood. We then formulate the target problem as one of optimizing the expected value of the objective function under the worst-case distribution in that set. Our model avoids the over-conservativeness of some prior robust approaches by ruling out unrealistic distributions while maintaining robustness of the solution for any statistically likely outcome. We present statistical analyses of our model using Bayesian statistics and empirical likelihood theory. Specifically, we characterize the asymptotic behavior of our distribution set and establish the relationship between our model and other distributionally robust models. To test the performance of our model, we apply it to the newsvendor problem and the portfolio selection problem. The test results show that the solutions of our model indeed have desirable performance.
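    A small sketch of the inner worst-case step as I read the abstract (my own formulation, variable names, and made-up data): the ambiguity set contains the distributions on the observed support whose log-likelihood of the historical counts is within a slack of the empirical maximum, and the worst-case expected cost over that set is a small convex program.

```python
# Sketch of the likelihood-robust inner problem (illustrative formulation, not
# the paper's exact model): among distributions p on the observed support whose
# log-likelihood of the historical counts is at least a threshold, find the one
# maximizing the expected cost of a fixed decision.
import numpy as np
from scipy.optimize import minimize

def worst_case_expected_cost(cost, counts, slack=2.0):
    """max_p cost @ p  s.t.  counts @ log(p) >= L_max - slack, sum(p)=1, p>=0."""
    counts = np.asarray(counts, dtype=float)
    cost = np.asarray(cost, dtype=float)
    p_mle = counts / counts.sum()                      # empirical distribution
    gamma = counts @ np.log(p_mle) - slack             # likelihood threshold

    objective = lambda p: -(cost @ p)                  # maximize => minimize negative
    constraints = [
        {"type": "eq",   "fun": lambda p: p.sum() - 1.0},
        {"type": "ineq", "fun": lambda p: counts @ np.log(np.maximum(p, 1e-12)) - gamma},
    ]
    res = minimize(objective, p_mle, method="SLSQP",
                   bounds=[(1e-12, 1.0)] * len(cost), constraints=constraints)
    return -res.fun, res.x

# Hypothetical newsvendor-style data: demand levels with observed counts and costs.
counts = [30, 50, 15, 5]          # how often each demand level was observed
cost = [4.0, 1.0, 2.5, 8.0]       # cost incurred at each demand level for a fixed order
wc_cost, wc_dist = worst_case_expected_cost(cost, counts)
print("worst-case expected cost:", wc_cost)
print("worst-case distribution :", np.round(wc_dist, 3))
```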

    Close the Gaps: A Learning-while-Doing Algorithm for a Class of Single-Product Revenue Management Problems

    We consider a retailer selling a single product with limited on-hand inventory over a finite selling season. Customer demand arrives according to a Poisson process, the rate of which is influenced by a single action taken by the retailer (such as price adjustment, sales commission, advertisement intensity, etc.). The relationship between the action and the demand rate is not known in advance. However, the retailer is able to learn the optimal action "on the fly" as she maximizes her total expected revenue based on the observed demand reactions. Using the pricing problem as an example, we propose a dynamic "learning-while-doing" algorithm that involves only function-value estimation and achieves near-optimal performance. Our algorithm employs a series of shrinking price intervals and iteratively tests prices within the current interval using a set of carefully chosen parameters. We prove that the convergence rate of our algorithm is among the fastest of all possible algorithms in terms of asymptotic "regret" (the relative loss compared to the full-information optimal solution). Our result closes the performance gaps between parametric and non-parametric learning and between a posted-price mechanism and a customer-bidding mechanism. An important managerial insight from this research is that the value of information on both the parametric form of the demand function and each customer's exact reservation price is less important than the prior literature suggests. Our results also suggest that firms would be better off performing dynamic learning and acting concurrently rather than sequentially.
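    A simplified simulation of the shrinking-interval idea (the demand model, grid size, and phase lengths are invented for illustration and are not the paper's carefully chosen parameters):

```python
# Simplified simulation of the learning-while-doing idea (illustration only;
# the demand model and phase schedule are invented, not the paper's choices):
# test a grid of prices in the current interval, estimate revenue rates from
# observed Poisson demand, then shrink the interval around the best price.
import numpy as np

rng = np.random.default_rng(0)
true_rate = lambda p: 10.0 * np.exp(-0.5 * p)     # unknown demand rate lambda(p)

lo, hi = 0.5, 8.0                                  # initial price interval
for phase in range(6):
    prices = np.linspace(lo, hi, 5)                # grid of test prices
    phase_len = 50 * 2**phase                      # time spent per price (grows)
    revenue_rate = []
    for p in prices:
        demand = rng.poisson(true_rate(p) * phase_len)   # observed sales
        revenue_rate.append(p * demand / phase_len)      # estimated p*lambda(p)
    best = prices[int(np.argmax(revenue_rate))]
    width = (hi - lo) / 3.0                        # shrink around the best price
    lo, hi = max(0.5, best - width / 2), min(8.0, best + width / 2)
    print(f"phase {phase}: best price so far {best:.3f}, interval [{lo:.3f}, {hi:.3f}]")

# For this demand model the true revenue-maximizing price is 1/0.5 = 2.0.
```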

    On the behavior of Lagrange multipliers in convex and non-convex infeasible interior point methods

    We analyze sequences generated by interior point methods (IPMs) in convex and nonconvex settings. We prove that driving the primal feasibility to zero at the same rate as the barrier parameter $\mu$ ensures that the Lagrange multiplier sequence remains bounded, provided the limit point of the primal sequence has a Lagrange multiplier. This result does not require constraint qualifications. We also guarantee that the IPM finds a solution satisfying strict complementarity if one exists. On the other hand, if the primal feasibility is reduced too slowly, then the algorithm converges to a point of minimal complementarity; if the primal feasibility is reduced too quickly and the set of Lagrange multipliers is unbounded, then the norm of the Lagrange multipliers tends to infinity. Our theory has important implications for the design of IPMs. Specifically, we show that IPOPT, an algorithm that does not carefully control primal feasibility, has practical issues with the dual multiplier values growing unnecessarily large. Conversely, the one-phase IPM of \citet*{hinder2018one}, an algorithm that controls primal feasibility as our theory suggests, has no such issue.
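    One hedged way to write down the coupling the abstract describes, in generic IPM notation rather than the paper's exact conditions: for inequality constraints a(x) <= 0 with slacks s and multipliers y, the iterates keep complementarity and primal infeasibility shrinking at the same rate,

```latex
% Hedged paraphrase in generic IPM notation (not the paper's exact conditions):
% a(x) <= 0 are the inequality constraints, s_k > 0 the slacks, y_k >= 0 the
% multipliers, and mu_k the barrier parameter. Keeping the primal residual of
% the same order as mu_k is what keeps {y_k} bounded whenever the primal limit
% point admits a Lagrange multiplier.
\[
  (s_k)_i\,(y_k)_i \approx \mu_k \ \ \text{for all } i,
  \qquad
  \| a(x_k) + s_k \| = \Theta(\mu_k),
  \qquad
  \mu_k \downarrow 0 .
\]
```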

    Computations and Complexities of Tarski's Fixed Points and Supermodular Games

    We consider two models of computation for an order-preserving function f with fixed points in a complete lattice, as in Tarski's theorem: the oracle function model and the polynomial function model. In both models, we give the first polynomial-time algorithm for finding a Tarski fixed point. In addition, we provide a matching oracle bound for determining uniqueness in the oracle function model and prove that the problem is co-NP hard in the polynomial function model. The existence of a pure Nash equilibrium in supermodular games follows from Tarski's fixed point theorem. Exploring the difference between supermodular games and Tarski's fixed points, we also develop computational results for finding one pure Nash equilibrium and for determining the uniqueness of the equilibrium in supermodular games.
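    For intuition about the oracle model, here is the classical one-dimensional special case (an illustration, not the paper's general algorithm): for a monotone f on the chain {0,...,N}, maintaining the invariant f(lo) >= lo and f(hi) <= hi lets binary search find a fixed point with O(log N) oracle calls.

```python
# Classical one-dimensional special case of Tarski's theorem in the oracle model
# (illustration only, not the paper's algorithm): for a monotone map f on the
# chain {0, ..., N}, the invariant f(lo) >= lo and f(hi) <= hi guarantees a
# fixed point in [lo, hi], so binary search finds one in O(log N) oracle calls.
def tarski_fixed_point_1d(f, N):
    """Find x in {0,...,N} with f(x) == x, given monotone f: {0..N} -> {0..N}."""
    lo, hi = 0, N                 # f(0) >= 0 and f(N) <= N always hold
    while lo < hi:
        mid = (lo + hi) // 2
        v = f(mid)
        if v == mid:
            return mid
        elif v > mid:             # monotonicity gives f(mid+1) >= f(mid) >= mid+1
            lo = mid + 1
        else:                     # f(mid) < mid, so a fixed point lies in [lo, mid]
            hi = mid
    return lo                     # the two invariants force f(lo) == lo

# Example: a monotone map on {0,...,1000} (hypothetical oracle).
f = lambda x: min(1000, (x // 2) + 300)
print(tarski_fixed_point_1d(f, 1000))   # 600 is a fixed point: f(600) = 600
```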

    A Dynamic Near-Optimal Algorithm for Online Linear Programming

    A natural optimization model that formulates many online resource allocation and revenue management problems is the online linear program (LP), in which the constraint matrix is revealed column by column along with the corresponding objective coefficient. In such a model, a decision variable has to be set each time a column is revealed, without observing the future inputs, and the goal is to maximize the overall objective function. In this paper, we provide a near-optimal algorithm for this general class of online problems under the assumption of random order of arrival and some mild conditions on the size of the LP right-hand-side input. Specifically, our learning-based algorithm works by dynamically updating a threshold price vector at geometric time intervals, where the dual prices learned from the revealed columns in the previous period are used to determine the sequential decisions in the current period. Owing to this dynamic learning, the competitiveness of our algorithm improves over previous studies of the same problem. We also present a worst-case example showing that the performance of our algorithm is near-optimal.
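    A rough sketch of the dynamic dual-price idea on synthetic data (my own simplification; the right-hand-side scaling and the update schedule are not the paper's exact choices): at geometric checkpoints, solve the LP restricted to the columns seen so far with a proportionally scaled right-hand side, and use the resulting dual prices as acceptance thresholds afterwards.

```python
# Rough sketch of the dynamic dual-price idea for online LP (illustration only;
# the scaling and update schedule are simplified, not the paper's exact choices).
# At geometric checkpoints, solve the LP on the columns revealed so far with a
# proportionally scaled right-hand side and reuse its dual prices as thresholds.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m = 1000, 5                                   # columns (arrivals) and resources
A = rng.uniform(0.0, 1.0, size=(m, n))           # column j consumes A[:, j]
c = rng.uniform(0.0, 1.0, size=n)                # objective coefficient of column j
b = 0.25 * n * np.ones(m)                        # resource capacities

def dual_prices(cols, t):
    """Duals of: max c'x s.t. A x <= (t/n) b, 0 <= x <= 1, over revealed columns."""
    res = linprog(-c[cols], A_ub=A[:, cols], b_ub=(t / n) * b,
                  bounds=[(0.0, 1.0)] * len(cols), method="highs")
    return -res.ineqlin.marginals                # nonnegative resource prices

prices = np.zeros(m)
remaining = b.copy()
next_update = max(1, n // 64)                    # first checkpoint; then doubles
revenue = 0.0
for t in range(n):                               # columns arrive one by one
    if t == next_update:
        prices = dual_prices(np.arange(t), t)
        next_update *= 2
    accept = c[t] > prices @ A[:, t] and np.all(A[:, t] <= remaining)
    if accept:
        remaining -= A[:, t]
        revenue += c[t]
print("online revenue:", round(revenue, 2))
```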